SGI Developer Toolbox 6.1

Text File | 1994-08-01 | 6KB | 219 lines

Klatt Cascade / Parallel Formant Synthesizer
--------------------------------------------

History
-------

This directory contains a version of the Klatt Cascade / Parallel Formant
Speech Synthesizer. The software for this synthesizer was originally
described in (1), and an updated version was described in (2). An
up-to-date version of the software synthesizer as described in (2) is
commercially available from Sensimetrics (3).

The code contained within this directory is a translation of the original
Fortran into C, by Dennis Klatt. Relative to the two articles cited above,
this version sits midway in development between the two systems they
describe.

Modifications
-------------

The main part of the code in this directory was posted to comp.speech in
early 1993 as part of a crude text-to-speech conversion system. The code
taken from comp.speech seemed to have been modified considerably from the
original, and for use of the synthesizer in research it was necessary to
"fix" the changes that had been made. The major changes I have made are:

1. Re-introduced the parallel-only / cascade-parallel switch. This allows
   a choice of synthesis method: either both branches, or the parallel
   branch alone.

2. Correct use of bandwidth parameters. One of the cascade bandwidth
   parameters was being wrongly used in the parallel branch of the
   synthesizer.

3. Correct operation of the natural voicing source. The amplitude of the
   natural voicing source was very much smaller than that of the impulse
   source, making it difficult to swap between them to evaluate the
   differences.

4. Removed the software synthesizer from the context of a text-to-speech
   system. The synthesizer is now a stand-alone program that accepts
   input as a set of parameters from a file and writes output to a file
   or to stdout.

5. Added command line options to control the parameters that remain
   constant during synthesis.

6. Added F0 flutter control, as described in (2).

Input File Format
-----------------

The input file consists of a series of parameter frames. Each frame of
parameters (usually) represents 10 ms of audio output. The parameters in
each frame are described below. To avoid confusion, note that the cascade
and parallel branches of the synthesizer duplicate some of the control
parameters.

f0      Fundamental frequency (pitch) of the utterance, specified in
        steps of 0.1 Hz. Hence 100 Hz is represented by a value of 1000.
av      Amplitude of voicing for the cascade branch of the synthesizer,
        in dB. Range 0-80, value usually 60 for a vowel sound.
f1      First formant frequency in Hz.
b1      Cascade branch bandwidth of first formant in Hz.
f2      Second formant frequency in Hz.
b2      Cascade branch bandwidth of second formant in Hz.
f3      Third formant frequency in Hz.
b3      Cascade branch bandwidth of third formant in Hz.
f4      Fourth formant frequency in Hz.
b4      Cascade branch bandwidth of fourth formant in Hz.
f5      Fifth formant frequency in Hz.
b5      Cascade branch bandwidth of fifth formant in Hz.
f6      Sixth formant frequency in Hz.
b6      Cascade branch bandwidth of sixth formant in Hz.
fnz     Frequency of the nasal zero in Hz (cascade branch only).
bnz     Bandwidth of the nasal zero in Hz (cascade branch only).
fnp     Frequency of the nasal pole in Hz (cascade branch only).
bnp     Bandwidth of the nasal pole in Hz (cascade branch only).
ap      Amplitude of aspiration.
kopen   Open quotient of voicing waveform, range 0-60, usually 30.
aturb   Amplitude of turbulence, range 0-80. A value of 40 is fairly
        useful.
tilt    Spectral tilt.
af      Amplitude of frication in dB, range 0-80 (parallel branch).
skew    Spectral skew.
a1      Amplitude of first formant in the parallel branch, in dB.
        Range 0-80.
b1p     Bandwidth of the first formant in the parallel branch, in Hz.
a2      Amplitude of parallel branch second formant.
b2p     Bandwidth of parallel branch second formant.
a3      Amplitude of parallel branch third formant.
b3p     Bandwidth of parallel branch third formant.
a4      Amplitude of parallel branch fourth formant.
b4p     Bandwidth of parallel branch fourth formant.
a5      Amplitude of parallel branch fifth formant.
b5p     Bandwidth of parallel branch fifth formant.
a6      Amplitude of parallel branch sixth formant.
b6p     Bandwidth of parallel branch sixth formant.
anp
ab
avp     Amplitude of voicing for the parallel branch.
gain    Gain in dB, range 0-80. 50 is a useful value.

Command Line Options
--------------------

-h              Display a help message.
-i <filename>   Set input filename.
-o <outfile>    Set output filename. If no output filename is specified,
                stdout is used.
-q              Quiet: print no messages.
-t <n>          Select output waveform (RTFC!).
-c              Select cascade-parallel configuration. Parallel-only
                configuration is the default.
-n <number>     Number of formants in cascade branch. Default is 5.
-s <n>          Set sample rate. Default is 10 kHz.
-f <n>          Set number of milliseconds per frame. Default is 10 ms
                per frame.
-v              Use the impulse voicing source. Default is natural
                voicing.
-F <percent>    Percentage of f0 flutter. Default is 0.

References
----------

(1) @article{klatt1980,
      author  = {Klatt, D.H.},
      title   = {Software for a cascade/parallel formant synthesizer},
      journal = {Journal of the Acoustical Society of America},
      year    = {1980},
      volume  = {67},
      number  = {3},
      pages   = {971--995},
      month   = {March}}

(2) @article{klatt1990,
      author  = {Klatt, D.H. and Klatt, L.C.},
      title   = {Analysis, synthesis and perception of voice quality
                 variations among female and male talkers},
      journal = {Journal of the Acoustical Society of America},
      year    = {1990},
      volume  = {87},
      number  = {2},
      pages   = {820--857},
      month   = {February}}

(3) Dr. David Williams at Sensimetrics Corporation,
    64 Sidney Street, Cambridge, MA 02139.
    Fax: (617) 225-0470
    Tel: (617) 225-2442
    E-mail: sensimetrics@sens.com